CONFIDENTIAL



>CPN100686 protein-export membrane protein SecD

VSSPILNVPLKNHASVSGKFTHREVSKLASDLKSGAMSFVPEVLSEETISSDLGKKQCTQGIISACCGLAMLIVL 
MSVTYRFGGVIASGAVLLNLLLIWAALQYLDAPLTLSGLAGIVLAMGMATO^ 

KGYTKAFGAIFDSNLTTVLASALLFFLDTGPIKGFALTLILGIFSSMFTALFMTKFFFMLWMNKTQHTQLHMMNK

FVGIKHDFLRGCKKLWAVSGSVFLLGCVALGFGAWNSVLGMDFKGGYAFTFNPKEHGISDVAQMRGKWHKLQEA 

GLSSRDFRIQTFGSSEKIKIYFSDKSFKLYXSRYEPLSXNXRSXAGVNVGLLSETGLDFSTETLNETQNFWSKVS 

SKLSKKMRYQATIGLLGALAIILLYVSLRFEWQYAFSAVCALIHDLLATCAVLFIAHFFLKKIQIDLQAIGALMT 

VLGYSLNOTLIIFDRIREDRQAIOLFTPMHVLVNDALQKTFSRTVMTTATTLS 

ILLGTLSSLYIAPPLLL 



ACTUAL ENCODED SEQUENCE 



   1 MVSSPILNVP LKNHASVSGK FTHREVSKLA SDLKSGAMSF VPEVLSEETI
  51 SSDLGKKQCT QGIISACCGL AMLIVLMSVY YRFGGVIASG AVLLNLLLIW
 101 AALQYLDAPL TLSGLAGIVL AMGMAVDANV LVFERIREEF LLSQSLKKSV
 151 EKGYTKAFGA IFDSNLTTVL ASALLFFLDT GPIKGFALTL ILGIFSSMFT
 201 ALFMTKFFFM LWMNKTQHTQ LHMMNKFVGI KHDFLRGCKK LWAVSGSVFL
 251 LGCVALGFGA WNSVLGMDFK GGYAFTFNPK EHGISDVAQM RGKVVHKLQE
 301 AGLSSRDFRI QTFGSSEKIK IYFSDKALSY TKQIRASLLK LTIMSWRYCG
 351 IVVRNRPRFL YGNSKRNAKF WSKVSSKLSK KMRYQATIGL LGALAIILLY
 401 VSLRFEWQYA FSAVCALIHD LLATCAVLFI AHFFLKKIQI DLQAIGALMT
 451 VLGYSLNNTL IIFDRIREDR QANLFTPMHV LVNDALQKTF SRTVMTTATT
 501 LSVLLMLLFI GGSSVFNFAF IMTIGILLGT LSSLYIAPPL LLFMVRKENR
 551 SK*












CODING SEQUENCE 

THE PROTEIN IS ENCODED ON THE POSITIVE STRAND 
The ATG is presumably the start codon 
The TAA is presumably the stop codon 
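
As a rough illustration of how the start and stop codons delimit the encoded protein, the following Python sketch translates a DNA string with the standard genetic code and stops at the first stop codon. The names `CODON_TABLE` and `translate` are hypothetical helpers, not part of the original analysis. The sketch also assumes that translation begins at the ATG at position 101, which is where the alignment further down begins.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, start=0):
    """Translate dna from the 0-based offset start through the first stop codon.

    Unknown codons (for example ones containing N) are reported as 'X'.
    """
    dna = dna.upper()
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[pos:pos + 3], "X")
        protein.append(aa)
        if aa == "*":  # stop codon reached
            break
    return "".join(protein)


# First ten codons of the reading frame that starts at position 101.
print(translate("ATGGTCAGCAGCCCTATTTTAAACGTCCCA"))
# Last codons of the frame, ending at the TAA stop.
print(translate("CGCTCAAAATAAGTACCG"))
```

To translate the full listing, the 1864-nucleotide sequence would be passed with `start=100`, since position 101 in the listing is offset 100 in a 0-based string.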



   1 ATGGACTTCC GCATATTGTC AGGAGGGGAT CAGCGGCACT GCTAATGGAC
  51 AATATTCTGC AAACCGTGGA TGGCGTATGG CTGTAGTGAT TGACGGTTAT
 101 ATGGTCAGCA GCCCTATTTT AAACGTCCCA TTGAAAAATC ATGCCAGTGT
 151 CTCAGGGAAA TTTACCCACC GTGAAGTGAG CAAACTCGCC TCAGATTTAA
 201 AATCTGGAGC GATGTCTTTT GTTCCCGAGG TTCTCAGTGA AGAGACGATC
 251 TCTTCTGATC TTGGGAAAAA ACAATGTACA CAAGGCATTA TCTCAGCATG
 301 CTGTGGCTTG GCAATGCTTA TTGTTTTGAT GAGCGTATAT TATAGATTTG
 351 GAGGCGTCAT CGCTTCGGGA GCTGTTCTTC TGAATCTTTT GCTTATCTGG
 401 GCAGCTCTAC AGTATTTGGA TGCGCCACTC ACCTTGTCAG GACTCGCTGG
 451 GATTGTTCTT GCTATGGGGA TGGCCGTAGA TGCAAATGTT CTTGTATTCG
 501 AAAGAATCCG AGAGGAATTT TTATTGTCTC AAAGTCTTAA AAAATCTGTA
 551 GAAAAAGGAT ATACCAAGGC TTTTGGAGCC ATTTTTGATT CTAACTTGAC
 601 TACAGTATTG GCCTCAGCAC TTCTTTTCTT CCTAGATACA GGGCCTATTA
 651 AAGGGTTTGC TTTGACATTG ATTTTAGGAA TTTTCTCTTC AATGTTTACG
 701 GCTCTTTTCA TGACTAAATT TTTCTTCATG CTGTGGATGA ATAAGACCCA
 751 ACATACACAG TTGCATATGA TGAATAAGTT CGTGGGGATA AAGCATGATT
 801 TCTTGAGAGG ATGCAAAAAA CTTTGGGCTG TTTCTGGAAG TGTTTTTCTT
 851 TTAGGTTGCG TTGCTCTCGG GTTTGGAGCC TGGAATTCCG TTTTGGGAAT
 901 GGATTTTAAA GGAGGGTATG CCTTTACCTT TAATCCAAAA GAGCATGGCA
 951 TCAGCGATGT TGCTCAAATG CGTGGCAAAG TTGTGCATAA ACTACAGGAA
1001 GCTGGTCTTT CTTCTAGAGA CTTCCGTATT CAAACATTTG GATCTTCAGA
1051 AAAGATCAAA ATCTATTTTA GTGATAAAGC TTTAAGCTAT ACTAAGCAGA
1101 TACGAGCCTC TCTCCTAAAA TTAACGATCA TGAGCTGGCG TTATTGTGGG
1151 ATTGTTGTCA GAAACAGGCC TAGATTTCTC TACGGAAACT CTAAACGAAA
1201 CGCAAAATTT TGGTCAAAGG TAAGCAGCAA ACTATCGAAG AAAATGCGTT
1251 ATCAGGCGAC CATCGGGCTT TTAGGAGCTT TGGCAATCAT CTTGCTCTAT
1301 GTGAGTTTGC GCTTTGAATG GCAATATGCT TTCAGTGCCG TATGCGCTTT
1351 AATTCATGAC CTTTTGGCTA CCTGTGCAGT CTTGTTTATA GCACATTTCT
1401 TTTTGAAGAA AATTCAAATA GATTTGCAAG CCATTGGTGC TTTAATGACT
1451 GTATTGGGGT ATTCATTAAA CAATACTTTG ATCATTTTTG ATCGTATTCG
1501 TGAAGATCGC CAAGCGAACC TGTTTACCCC TATGCATGTT TTAGTTAATG
1551 ATGCCCTTCA AAAGACGTTC AGCCGCACGG TAATGACAAC AGCTACAACT
1601 CTATCAGTTT TGTTAATGCT TTTGTTTATA GGCGGCTCCT CTGTCTTTAA
1651 TTTTGCATTT ATTATGACCA TAGGGATTCT TCTAGGAACT TTATCGTCTC
1701 TTTATATTGC ACCACCTCTG TTGTTGTTTA TGGTCCGTAA AGAAAATCGC
1751 TCAAAATAAG TACCGTTAAA CTTAATCTAA CGTGTAGCAA TATAAAAATC
1801 TCCTTTGGGA CTTTAGTCCC AAAGGCCCCT GTGGTATTAA ATTTATGACA
1851 AATTCAGATA ATGC



SEQUENCE ALIGNMENT 



 101 ATGGTCAGCAGCCCTATTTTAAACGTCCCATTGAAAAATCATGCCAGTGT 150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
   1 MetValSerSerProIleLeuAsnValProLeuLysAsnHisAlaSerVa 17

 151 CTCAGGGAAATTTACCCACCGTGAAGTGAGCAAACTCGCCTCAGATTTAA 200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  18 lSerGlyLysPheThrHisArgGluValSerLysLeuAlaSerAspLeuL 34

 201 AATCTGGAGCGATGTCTTTTGTTCCCGAGGTTCTCAGTGAAGAGACGATC 250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  35 ysSerGlyAlaMetSerPheValProGluValLeuSerGluGluThrIle 50

 251 TCTTCTGATCTTGGGAAAAAACAATGTACACAAGGCATTATCTCAGCATG 300
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  51 SerSerAspLeuGlyLysLysGlnCysThrGlnGlyIleIleSerAlaCy 67

 301 CTGTGGCTTGGCAATGCTTATTGTTTTGATGAGCGTATATTATAGATTTG 350
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  68 sCysGlyLeuAlaMetLeuIleValLeuMetSerValTyrTyrArgPheG 84

 351 GAGGCGTCATCGCTTCGGGAGCTGTTCTTCTGAATCTTTTGCTTATCTGG 400
     ||||||||||||||||||||||||||||||||||||||||||||||||||
  85 lyGlyValIleAlaSerGlyAlaValLeuLeuAsnLeuLeuLeuIleTrp 100

 401 GCAGCTCTACAGTATTTGGATGCGCCACTCACCTTGTCAGGACTCGCTGG 450
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 101 AlaAlaLeuGlnTyrLeuAspAlaProLeuThrLeuSerGlyLeuAlaGl 117

 451 GATTGTTCTTGCTATGGGGATGGCCGTAGATGCAAATGTTCTTGTATTCG 500
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 118 yIleValLeuAlaMetGlyMetAlaValAspAlaAsnValLeuValPheG 134

 501 AAAGAATCCGAGAGGAATTTTTATTGTCTCAAAGTCTTAAAAAATCTGTA 550
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 135 luArgIleArgGluGluPheLeuLeuSerGlnSerLeuLysLysSerVal 150

 551 GAAAAAGGATATACCAAGGCTTTTGGAGCCATTTTTGATTCTAACTTGAC 600
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 151 GluLysGlyTyrThrLysAlaPheGlyAlaIlePheAspSerAsnLeuTh 167

 601 TACAGTATTGGCCTCAGCACTTCTTTTCTTCCTAGATACAGGGCCTATTA 650
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 168 rThrValLeuAlaSerAlaLeuLeuPhePheLeuAspThrGlyProIleL 184

 651 AAGGGTTTGCTTTGACATTGATTTTAGGAATTTTCTCTTCAATGTTTACG 700
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 185 ysGlyPheAlaLeuThrLeuIleLeuGlyIlePheSerSerMetPheThr 200

 701 GCTCTTTTCATGACTAAATTTTTCTTCATGCTGTGGATGAATAAGACCCA 750
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 201 AlaLeuPheMetThrLysPhePhePheMetLeuTrpMetAsnLysThrGl 217

 751 ACATACACAGTTGCATATGATGAATAAGTTCGTGGGGATAAAGCATGATT 800
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 218 nHisThrGlnLeuHisMetMetAsnLysPheValGlyIleLysHisAspP 234

 801 TCTTGAGAGGATGCAAAAAACTTTGGGCTGTTTCTGGAAGTGTTTTTCTT 850
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 235 heLeuArgGlyCysLysLysLeuTrpAlaValSerGlySerValPheLeu 250

 851 TTAGGTTGCGTTGCTCTCGGGTTTGGAGCCTGGAATTCCGTTTTGGGAAT 900
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 251 LeuGlyCysValAlaLeuGlyPheGlyAlaTrpAsnSerValLeuGlyMe 267

 901 GGATTTTAAAGGAGGGTATGCCTTTACCTTTAATCCAAAAGAGCATGGCA 950
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 268 tAspPheLysGlyGlyTyrAlaPheThrPheAsnProLysGluHisGlyI 284

 951 TCAGCGATGTTGCTCAAATGCGTGGCAAAGTTGTGCATAAACTACAGGAA 1000
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 285 leSerAspValAlaGlnMetArgGlyLysValValHisLysLeuGlnGlu 300

1001 GCTGGTCTTTCTTCTAGAGACTTCCGTATTCAAACATTTGGATCTTCAGA 1050
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 301 AlaGlyLeuSerSerArgAspPheArgIleGlnThrPheGlySerSerGl 317

1051 AAAGATCAAAATCTATTTTAGTGATAAAGCTTTAAGCTATACTAAGCAGA 1100
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 318 uLysIleLysIleTyrPheSerAspLysAlaLeuSerTyrThrLysGlnI 334

1101 TACGAGCCTCTCTCCTAAAATTAACGATCATGAGCTGGCGTTATTGTGGG 1150
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 335 leArgAlaSerLeuLeuLysLeuThrIleMetSerTrpArgTyrCysGly 350

1151 ATTGTTGTCAGAAACAGGCCTAGATTTCTCTACGGAAACTCTAAACGAAA 1200
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 351 IleValValArgAsnArgProArgPheLeuTyrGlyAsnSerLysArgAs 367

1201 CGCAAAATTTTGGTCAAAGGTAAGCAGCAAACTATCGAAGAAAATGCGTT 1250
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 368 nAlaLysPheTrpSerLysValSerSerLysLeuSerLysLysMetArgT 384

1251 ATCAGGCGACCATCGGGCTTTTAGGAGCTTTGGCAATCATCTTGCTCTAT 1300
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 385 yrGlnAlaThrIleGlyLeuLeuGlyAlaLeuAlaIleIleLeuLeuTyr 400

1301 GTGAGTTTGCGCTTTGAATGGCAATATGCTTTCAGTGCCGTATGCGCTTT 1350
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 401 ValSerLeuArgPheGluTrpGlnTyrAlaPheSerAlaValCysAlaLe 417

1351 AATTCATGACCTTTTGGCTACCTGTGCAGTCTTGTTTATAGCACATTTCT 1400
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 418 uIleHisAspLeuLeuAlaThrCysAlaValLeuPheIleAlaHisPheP 434

1401 TTTTGAAGAAAATTCAAATAGATTTGCAAGCCATTGGTGCTTTAATGACT 1450
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 435 heLeuLysLysIleGlnIleAspLeuGlnAlaIleGlyAlaLeuMetThr 450

1451 GTATTGGGGTATTCATTAAACAATACTTTGATCATTTTTGATCGTATTCG 1500
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 451 ValLeuGlyTyrSerLeuAsnAsnThrLeuIleIlePheAspArgIleAr 467

1501 TGAAGATCGCCAAGCGAACCTGTTTACCCCTATGCATGTTTTAGTTAATG 1550
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 468 gGluAspArgGlnAlaAsnLeuPheThrProMetHisValLeuValAsnA 484

1551 ATGCCCTTCAAAAGACGTTCAGCCGCACGGTAATGACAACAGCTACAACT 1600
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 485 spAlaLeuGlnLysThrPheSerArgThrValMetThrThrAlaThrThr 500

1601 CTATCAGTTTTGTTAATGCTTTTGTTTATAGGCGGCTCCTCTGTCTTTAA 1650
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 501 LeuSerValLeuLeuMetLeuLeuPheIleGlyGlySerSerValPheAs 517

1651 TTTTGCATTTATTATGACCATAGGGATTCTTCTAGGAACTTTATCGTCTC 1700
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 518 nPheAlaPheIleMetThrIleGlyIleLeuLeuGlyThrLeuSerSerL 534

1701 TTTATATTGCACCACCTCTGTTGTTGTTTATGGTCCGTAAAGAAAATCGC 1750
     ||||||||||||||||||||||||||||||||||||||||||||||||||
 535 euTyrIleAlaProProLeuLeuLeuPheMetValArgLysGluAsnArg 550

1751 TCAAAA 1756
     ||||||
 551 SerLys 552
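
The layout of the alignment above, with one three-letter character under each nucleotide so that codes split across row boundaries, can be reproduced with a short Python sketch. The helper names `three_letter` and `alignment_rows` are hypothetical and are not taken from the program that produced this listing. The sketch assumes the DNA passed in starts exactly at the first codon.

```python
# IUPAC one-letter to three-letter amino acid codes.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def three_letter(protein):
    """Concatenate three-letter codes, one character per nucleotide."""
    return "".join(ONE_TO_THREE[aa] for aa in protein)


def alignment_rows(dna, protein, width=50):
    """Pair each DNA row with the three-letter text that sits under it."""
    codes = three_letter(protein)
    n = min(len(dna), len(codes))
    return [(dna[i:i + width], codes[i:i + width]) for i in range(0, n, width)]


for dna_row, aa_row in alignment_rows("ATGGTCAGCAGC", "MVSS", width=5):
    print(dna_row)
    print(aa_row)
```

With `width=50` the first protein row ends in the partial code `Va`, as in the first row of the listing, because 50 is not a multiple of three.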



RESTRICTION MAP 



Acil 



Hpyl78lII 
Mnll |Sau3AI 



Alwl 
Fnu4HI 
Taul 
Acil 
MspAlI 
Dpnl 
Cjel 



TspRI 
BtSl I 



Cjel CjePI 
Sspl CviRI | 

I 



ATGGACTTCCGCATATTGTCAGGAGGGGATCAGCGGCACTGCTAATGGACAATATTCTGC 
1 + + + + + + 60

TACCTGAAGGCGTATAACAGTCCTCCCCTAGTCGCCGTGACGATTACCTGTTATAAGACG

Taal CviJI 
BsaJT Fokl BsmFI Bbvl 
BstDSI Sfcl Taal Fnu4HI Dral 
Cjel I Bed CviJI I Cjel CjePI I Tsel I Msel I 
I I I II I I II || 
AAACCGTGGATGGCGTATGGCTGTAGTGATTGACGGTTATATGGTCAGCAGCCCTATTTT
61 + + + + + + 120

TTTGGCACCTACCGCATACCGACATCACTAACTGCCAATATACCAGTCGTCGGGATAAAA 



Maell 



BsrI 
MslI 
Nlalll I 



Apol 
Tsp509l 
BsmAI 
Ddel 

TspRI 



BseMII Taal 



AAACGTCCCATTGAAAAATCATGCCAGTGTCTCAGGGAAATTTACCCACCGTGAAGTGAG 

121 + + + + + + 180 

TTTGCAGGGTAACTTTTTAGTACGGTCACAGAGTCCCTTTAAATGGGTGGCACTTCACTC 



Hpyl78III 
BseMII ' 
Mnll 
Dral 



BsmAI 
BsmBI 
Earl 
Ddel 
Sthl32l 
Bpml 
BsaJI 
Hpyl78lII 
Aval 



HaelV 
Hpyl88lX Hin4l 

Ddel I Msel I Mnll 
I 

CAAACTCGCCTCAGATTTAAAATCTGGAGCGATGTCTTTTGTTCCCGAGGTTCTCAGTGA 

181 + + + + + + 240 

GTTTGAGCGGAGTCTAAATTTTAGACCTCGCTACAGAAAACAAGGGCTCCAAGAGTCACT

Dpnl 
Earl 
Sau3AI 
Hpyl88IX 
Mboll 
Dpnl 
BseMII ' 
Hin4I 
Sau3AI 
Mboll 
TspRI | 

I 



Rsal 
BsrGI 
Tat I 



Nlalll 
Nspl 
SphI 

Ddel Cac8I | 
I 



BplI 

I ... 
AGAGACGATCTCTTCTGATCTTGGGAAAAAACAATGTACACAAGGCATTATCTCAGCATG 
241 + + + + + + 300 

TCTCTGCTAGAGAAGACTAGAACCCTTTTTTGTTACATGTGTTCCGTAATAGAGTCGTAC 



Acelll 
Sthl32I 

BseMII Mnll Hin4l 

Cvi JI BsrDT Hgal I BsaHI I 

II ll ll 
CTGTGGCTTGGCAATGCTTATTGTTTTGATGAGCGTATATTATAGATTTGGAGGCGTCAT 
301 + + + + + + 360

GACACCGAACCGTTACGAATAACAAAACTACTCGCATATAATATCTAAACCTCCGCAGTA



Alul 
CviJI 
Mboll 
Mwol 
Hpyl78III | 



Acelll 
Bbvl 
Taal 
SfaNI 
Sfcl 



361 



Hinf I 
Tf il 
Hpyl88lX I 
I I 

CGCTTCGGGAGCTGTTCTTCTGAATCTTTTGCTTATCTGGGCAGCTCTACAGTATTTGGA 



Alul 
CviJI 
Fnu4HI 
Tsel 
Cjel I 



GCGAAGCCCTCGACAAGAAGACTTAGAAAACGAATAGACCCGTCGAGATGTCATAAACCT



-+ 420 



Hhal 
HphI I 



Hinf I 
Hpyl78III 
Plel 
Cjel 



Fokl 
I 



Beef I 



CviJI 
Haelll 
Bed 
Eael 
Gdill 
SfaNI I 



Mwol 



421 



TGCGCCACTCACCTTGTCAGGACTCGCTGGGATTGTTCTTGCTATGGGGATGGCCGTAGA



ACGCGGTGAGTGGAACAGTCCTGAGCGACCCTAACAAGAACGATACCCCTACCGGCATCT 



-+ 480 



CviRI 
Fokl | 
I 



Hpyl88IX 
Mnll 
NspV Hinf I 
TaqI Tfil 



Apol 
Tsp509l 



BsmAI Ms el 

TGCAAATGTTCTTGTATTCGAAAGAATCCGAGAGGAATTTTTATTGTCTCAAAGTCTTAA

481 + + + + + + 540 

ACGTTTACAAGAACATAAGCTTTCTTAGGCTCTCCTTAAAAATAACAGAGTTTCAGAATT 



CviJI 



Sfcl 
I 



BsaJI 
Sty I 



CviJI 
NlalV 
Mwol I 



Hinf I 
Tfil 



Sfcl 
I 



541 



AAAATCTGTAGAAAAAGGATATACCAAGGCTTTTGGAGCCATTTTTGATTCTAACTTGAC 



TTTTAGACATCTTTTTCCTATATGGTTCCGAAAACCTCGGTAAAAACTAAGATTGAACTG 



-+ 600 



BbvCI 
BpulOl 
Ddel 
CviJI 
Hael 
Taal Haelll 
I 



BseMII 
Mnll 
Mboll I 



CviJI 
Haelll 
EcoO109l 
Bfal Sau96I 



BslI 



EcoNI 
Msel 
I 



TACAGTATTGGCCTCAGCACTTCTTTTCTTCCTAGATACAGGGCCTATTAAAGGGTTTGC 

601 + + + + + + 660 

ATGTCATAACCGGAGTCGTGAAGAAAAGAAGGATCTATGTCCCGGATAATTTCCCAAACG



661 



ApoT 
Tsp509l 

MboII 
Beef I 

Apol NT a III 

MboII Hpyl78III 
Tsp509I Earl CviJI Real I 

I I III 

TTTGACATTGATTTTAGGAATTTTCTCTTCAATGTTTACGGCTCTTTTCATGACTAAATT



AAACTGTAACTAAAATCCTTAAAAGAGAAGTTACAAATGCCGAGAAAAGTACTGATTTAA



-+ 720 



721 



Ndel 

Fokl CviRI 
Nlalll SimI I Taal I XmnI 

I II I I I 
TTTCTTCATGCTGTGGATGAATAAGACCCAACATACACAGTTGCATATGATGAATAAGTT
+ + + + h + 

AAAGAAGTACGACACCTACTTATTCTGGGTTGTATGTGTCAACGTATACTACTTATTCAA 



780 



Hpyl78III 
Smll 
Mnll 
SfaNI 
Nlalll I 



Hpyl78III 
CviJI 
Bce83I 



781 



CviRI Fokl I 

CGTGGGGATAAAGCATGATTTCTTGAGAGGATGCAAAAAACTTTGGGCTGTTTCTGGAAG
+ + + + + + 

GCACCCCTATTTCGTACTAAAGAACTCTCCTACGTTTTTTGAAACCCGACAAAGACCTTC



840 



Sthl32l 



Aval 



Apol 
EcoRI 
Tsp509l 
ScrFI 
CviJI 
EcoRI I 
NldlV I 



841 



II I. 
TGTTTTTCTTTTAGGTTGCGTTGCTCTCGGGTTTGGAGCCTGGAATTCCGTTTTGGGAAT 

+ + + h + + 

ACAAAAAGAAAATCCAACGCAACGAGAGCCCAAACCTCGGACCTTAAGGCAAAACCCTTA

Dral 
Msel 
Mnll I 



900 



Msel 



901 



Nlalll SfaNI 

I I I 
GGATTTTAAAGGAGGGTATGCCTTTACCTTTAATCCAAAAGAGCATGGCATCAGCGATGT 
+ + + + + + 

CCTAAAATTTCCTCCCATACGGAAATGGAAATTAGGTTTTCTCGTACCGTAGTCGCTACA 



960 



CviRI 



Sfcl 



MboII 
Alul 
CviJI 



Hpyl78lII 
Bfal 
Xbal 
BsmAI I 



TGCTCAAATGCGTGGCAAAGTTGTGCATAAACTACAGGAAGCTGGTCTTTCTTCTAGAGA

961 + + + + + + 1020 

ACGAGTTTACGCACCGTTTCAACACGTATTTGATGTCCTTCGACCAGAAAGAAGATCTCT



BsaBI 
Dpnl 
Sau3AI 
Alwl I 
Hpyl88lX| | 



1021 



Tthlllll 
Dpnl 

BstYI Alul 
Sau3AI CviJI 
Eco57l MboII | Hindlll 

CTTCCGTATTCAAACATTTGGATCTTCAGAAAAGATCAAAATCTATTTTAGTGATAAAGC 



GAAGGCATAAGTTTGTAAACCTAGAAGTCTTTTCTAGTTTTAGATAAAATCACTATTTCG 



-+ 1080 



Cac8I 
RleAI 
Alul 
CviJI 
Nlalll 
Hpyl78lII 
Real 
BplI 



Alul 
CviJI 
Msel 1 Ddel 



Hin4I 
CviJI I 



Dpnl 
Sau3AI 
Msel 
Acelll 



Tsp509I 
Mnll I 



1081 



1141 



TTTAAGCTATACTAAGCAGATACGAGCCTCTCTCCTAAAATTAACGATCATGAGCTGGCG

+ n h + + + 

AAATTCGATATGATTCGTCTATGCTCGGAGAGAGGATTTTAATTGCTAGTACTCGACCGC 

Bfal 
CviJI 
Hael 
Haelll 
Hpyl88IX StuI 

I I 

TTATTGTGGGATTGTTGTCAGAAACAGGCCTAGATTTCTCTACGGAAACTCTAAACGAAA 



1140 



-+ 1200 



AATTACACCCTAACAACAGTCTTTGTCCGGATCTAAAGAGATGCCTTTGAGATTTGCTTT 



1201 



Bcgl 

Apol Fnu4HI Bbvl Sthl3 2I 

Tsp509I Tsell Taql! MboII I Bcgl 

i il il ill 

CGCAAAATTTTGGTCAAAGGTAAGCAGCAAACTATCGAAGAAAATGCGTTATCAGGCGAC 



GCGTTTTAAAACCAGTTTCCATTCGTCGTTTGATAGCTTCTTTTACGCAATAGTCCGCTG



-+ 1260 



Bed CviJI 



Alul 
CviJI 



1261 



Hhal 
I 

CATCGGGCTTTTAGGAGCTTTGGCAATCATCTTGCTCTATGTGAGTTTGCGCTTTGAATG 
+ -i h + + + 

GTAGCCCGAAAATCCTCGAAACCGTTAGTAGAACGAGATACACTCAAACGCGAAACTTAC 



1320 



Nlalll 
Hpyl78lII 
Tsp509l 
Msel " 



TspRI Hhal 
Beef I Mwol |MwoI ! Real 

I I III ... 

GCAATATGCTTTCAGTGCCGTATGCGCTTTAATTCATGACCTTTTGGCTACCTGTGCAGT 

1321 + + + + + + 1380 

CGTTATACGAAAGTCACGGCATACGCGAAATTAAGTACTGGAAAACCGATGGACACGTCA 



CviRI 
CviJI Mwol I 



Apol 
Tsp509l MboII 



CviJI 
Cac8I 

Bsgl Tsp509I MboII CviRI I Mwol 

I II III 

CTTGTTTATAGCACATTTCTTTTTGAAGAAAATTCAAATAGATTTGCAAGCCATTGGTGC 

1381 + + + + + + 1440 

GAACAAATATCGTGTAAAGAAAAACTTCTTTTAAGTTTATCTAAACGTTCGGTAACCACG 



Dpnl 
Bell 
Sau3AI 



Dpnl 

Sau3AI |Hpyl78III 



Msel Taal Msel 

I I I 

TTTAATGACTGTATTGGGGTATTCATTAAACAATACTTTGATCATTTTTGATCGTATTCG 

1441 + + + + + + 1500 

AAATTACTGACATAACCCCATAAGTAATTTGTTATGAAACTAGTAAAAACTAGCATAAGC



SfaNI 
Nlalll 
Nspl 

Dpnl Nsil 
Sau3Al I MboII CviRI I Msel 

ill ll 
TGAAGATCGCCAAGCGAACCTGTTTACCCCTATGCATGTTTTAGTTAATGATGCCCTTCA 

1501 + » + + + 1560 

ACTTCTAGCGGTTCGCTTGGACAAATGGGGATACGTACAAAATCAATTACTACGGGAAGT 



Maell 
I 



Acil 
Fnu4HI 
Taul 
CviJI 
I 



MslI 
Taal 



Alul 
CviJI 



Msel 



AAAGACGTTCAGCCGCACGGTAATGACAACAGCTACAACTCTATCAGTTTTGTTAATGCT 

1561 + + + + + + 1620 

TTTCTGCAAGTCGGCGTGCCATTACTGTTGTCGATGTTGAGATAGTCAAAACAATTACGA 



NlalV 
CviJI 
Fnu4HI 
Taul 
BseRI Acil | 



Mnll 
Tsp509l 
Msel I 



CviRI 
I 



CjePI 



Hinf I 
MboII Tfil 



TTTGTTTATAGGCGGCTCCTCTGTCTTTAATTTTGCATTTATTATGACCATAGGGATTCT 

1621 + + + + + + 1680 

AAACAAATATCCGCCGAGGAGACAGAAATTAAAACGTAAATAATACTGGTATCCCTAAGA



BsmAI Avail 
Bfal CjePI BsmBI CviRI Mnll Sau96I 

I ill ii 

TCTAGGAACTTTATCGTCTCTTTATATTGCACCACCTCTGTTGTTGTTTATGGTCCGTAA 

1681 + + + + + + 1740 

AGATCCTTGAAATAGCAGAGAAATATAACGTGGTGGAGACAACAACAAATACCAGGCATT 



Msel 

Taal Afllll 
Rsal I Msel Maell 

il i i 

AGAAAATCGCTCAAAATAAGTACCGTTAAACTTAATCTAACGTGTAGCAATATAAAAATC 
1741 + + + + + + 1800 

TCTTTTAGCGAGTTTTATTCATGGCAATTTGAATTAGATTGCACATCGTTATATTTTTAG 



NlalV 
CviJI I 
Haelll | 



BsmFI 



PshAI 



EcoO109l 
Sau96l 
BsmFI I 



Apol 
Tsp509l 
Msel I 



Hpyl88lX 
Apol 
Tsp509I 



TCCTTTGGGACTTTAGTCCCAAAGGCCCCTGTGGTATTAAATTTATGACAAATTCAGATA 

1801 + + + + + + 1860 

AGGAAACCCTGAAATCAGGGTTTCCGGGGACACCATAATTTAAATACTGTTTAAGTCTAT 

ATGC 

1861 1864 

TACG 



Enzymes that do cut:

AceIII
AciI
AflIII
AluI
AlwI
ApoI
AvaI
AvaII
BbvI
BbvCI
BccI
Bce83I
BcefI
BcgI
BclI
BfaI
BplI
BpmI
Bpu10I
BsaBI
BsaHI
BsaJI
BseMII
BseRI
BsgI
BslI
BsmAI
BsmBI
BsmFI
BsrI
BsrDI
BsrGI
BstDSI
BstYI
BtsI
Cac8I
CjeI
CjePI
CviJI
CviRI
DdeI
DpnI
DraI
EaeI
EarI
Eco57I
EcoNI
EcoO109I
EcoRI
EcoRII
Fnu4HI
FokI
GdiII
HaeI
HaeIII
HaeIV
HgaI
HhaI
Hin4I
HindIII
HinfI
HphI
Hpy178III
Hpy188IX
MaeII
MboII
MnlI
MseI
MslI
MspA1I
MwoI
NdeI
NlaIII
NlaIV
NsiI
NspI
NspV
PleI
PshAI
RcaI
RleAI
RsaI
Sau96I
Sau3AI
ScrFI
SfaNI
SfcI
SimI
SmlI
SphI
SspI
Sth132I
StuI
StyI
TaaI
TaqI
TatI
TauI
TfiI
TseI
Tsp509I
TspRI
Tth111II
XbaI
XmnI
















Enzymes that do not cut:

AarI
AatII
AccI
AclI
AflII
AhdI
AloI
AlwNI
ApaI
ApaLI
AscI
AvrII
BaeI
BamHI
BanI
BanII
BbsI
BciVI
BglI
BglII
BmgI
BmrI
Bpu1102I
BsaI
BsaAI
BsaWI
BsaXI
BsbI
BscGI
BseSI
BsiEI
BsiHKAI
BsmI
Bsp24I
Bsp1286I
BspEI
BspGI
BspLU11I
BspMI
BsrBI
BsrFI
BssHII
BssSI
BstAPI
BstEII
BstXI
BstZ17I
Bsu36I
ClaI
DraIII
DrdI
DrdII
EagI
EciI
Eco47III
EcoRV
FauI
FseI
FspI
HaeII
HgiEII
HincII
HpaI
KpnI
MaeIII
MluI
MmeI
MscI
MspI
MunI
NarI
NciI
NcoI
NgoAIV
NheI
NotI
NruI
PacI
Pfl1108I
PflMI
PinAI
PmeI
PmlI
PpiI
Psp5II
PstI
PvuI
PvuII
RsrII
SacI
SacII
SalI
SanDI
SapI
SbfI
ScaI
SexAI
SfiI
SgfI
SgrAI
SmaI
SnaBI
SpeI
SrfI
Sse8647I
SunI
SwaI
TaqII
ThaI
Tsp45I
Tth111I
VspI
XcmI
XhoI















>SW:SECF_MYCTU Q50635 mycobacterium tuberculosis, protein-export membrane
protein secf. 11/97
Length = 442

Score = 90 (42.3 bits), Expect = 5.9e-16, Sum P(5) = 5.9e-16
Identities = 30/118 (25%), Positives = 50/118 (42%) 

Query: 316 SEKIKIYFSDKALSYTKQIRASLLKLTIMSWRYCGIVVRNRPRFLYGNSKRNAKFWSKVS 375

SE + ST QIR+ L + + P+G+ASVS 

Sbjct: 121 SEPQSWIVGAGASATVQIRSETLTSDQTAKLRDALFEAFGPKGTDGQPSKQAISDSAVS 180 



Query: 376 SKLSKKMRYQATIGLLGALAIILLYVSLRFEWQYAFSAVCALIHDLLATCAVLFIAHF 433 

+ + +A I L+ L ++ LY+++R+E SA+ A++ DL T V 4- P 

Sbjct: 181 ETWGGQITKKAVIALVVFLVLVALYITVRYERYMTISAITAMLFDLTVTAGVYSLVGF 238 
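
The "Identities" figure reported for a hit is a position-by-position count of identical residues in the aligned segments. The Python sketch below shows that calculation for the second segment pair (Query 376-433 against Sbjct 181-238), using the two strings as printed above. The helper `count_identities` is a hypothetical name, and the sketch assumes an ungapped alignment of equal-length strings.

```python
def count_identities(query, subject):
    """Count positions where two equal-length aligned strings agree."""
    if len(query) != len(subject):
        raise ValueError("aligned segments must have the same length")
    return sum(q == s for q, s in zip(query, subject))


# Second segment pair of the SecF hit, copied from the listing.
query = "SKLSKKMRYQATIGLLGALAIILLYVSLRFEWQYAFSAVCALIHDLLATCAVLFIAHF"
sbjct = "ETWGGQITKKAVIALVVFLVLVALYITVRYERYMTISAITAMLFDLTVTAGVYSLVGF"

matches = count_identities(query, sbjct)
print(matches, len(query), round(100 * matches / len(query), 1))
```

The reported 30/118 total covers both segment pairs together. "Positives" additionally counts conservative substitutions under a scoring matrix, which this sketch does not attempt.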



